Python: Improve prompt-template msg serialize and sample usage by moonbox3 · Pull Request #13738 · microsoft/semantic-kernel

moonbox3 · 2026-04-06T03:13:31Z

Motivation and Context

This PR updates the Jinja2 and Handlebars prompt-template helpers to serialize chat messages through the existing XML/message serializer instead of assembling message XML manually. It also aligns the prompt-template samples with the serializer-backed helper and adds regression coverage for common message content.

Description

Prompt template message serialize improvements.

Contribution Checklist

The code builds clean without any errors or warnings
The PR follows the SK Contribution Guidelines and the pre-submission formatting script raises no violations
All unit tests pass, and I have added new tests where possible
I didn't break anyone 😄

Copilot

Pull request overview

This pull request improves Python prompt-template message serialization by using XML element serialization (instead of manual string concatenation) for Jinja2 and Handlebars message helpers, updates related samples to use serializer-backed helpers, and adds regression tests to ensure correct escaping and round-tripping of chat history with XML metacharacters.

Changes:

Update Jinja2 and Handlebars message helpers to build message XML via ElementTree serialization to correctly escape XML metacharacters.
Align Azure chat prompt-template samples to use message_to_prompt for serializer-backed message rendering.
Add unit and end to end tests covering escaping and chat history round-trip behavior, including system message preservation.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
python/semantic_kernel/prompt_template/utils/jinja2_system_helpers.py	Serialize `message()` output via XML element serialization to escape metacharacters.
python/semantic_kernel/prompt_template/utils/handlebars_system_helpers.py	Serialize `message` block helper output via XML element serialization to escape metacharacters.
python/tests/unit/prompt_template/test_jinja2_prompt_template.py	Add regression test validating escaping and round-trip parsing for `message()` helper.
python/tests/unit/prompt_template/test_jinja2_prompt_template_e2e.py	Add end to end tests for round-trip with metacharacters and system role preservation.
python/tests/unit/prompt_template/test_handlebars_prompt_template.py	Add regression test validating escaping and round-trip parsing for `message` block helper.
python/tests/unit/prompt_template/test_handlebars_prompt_template_e2e.py	Add end to end tests for round-trip with metacharacters and system role preservation.
python/samples/concepts/prompt_templates/azure_chat_gpt_api_jinja2.py	Update sample template to use `message_to_prompt(item)` for serializer-backed rendering.
python/samples/concepts/prompt_templates/azure_chat_gpt_api_handlebars.py	Update sample template to use `{{message_to_prompt}}` for serializer-backed rendering.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

python/tests/unit/prompt_template/test_jinja2_prompt_template_e2e.py

python/tests/unit/prompt_template/test_handlebars_prompt_template_e2e.py

github-actions

Automated Code Review

Reviewers: 3 | Confidence: 89%

✓ Security Reliability

This PR is a security improvement that replaces manual f-string XML construction with xml.etree.ElementTree for generating XML tags in both Handlebars and Jinja2 prompt template helpers. Previously, user content containing XML metacharacters (<, >, &, ") was injected verbatim into XML strings, which could corrupt the XML structure and cause incorrect message reconstruction via ChatHistory.from_rendered_prompt(). The fix uses Element/tostring for proper escaping, consistent with how the rest of the codebase (ChatMessageContent.to_element, ChatHistory.from_rendered_prompt) already handles XML. The use of xml.etree.ElementTree for serialization-only (not parsing) is safe and matches the existing nosec B405 pattern in 12+ other files; defusedxml is correctly used elsewhere for parsing. The sample files are updated to demonstrate the message_to_prompt helper. Test coverage is thorough, including round-trip tests that verify ChatHistory.from_rendered_prompt correctly recovers the original messages after escaping.

✓ Test Coverage

This PR switches the handlebars and jinja2 _message helpers from manual XML string concatenation to xml.etree.ElementTree, properly escaping XML metacharacters in message content and attribute values. The test additions are well-structured: they verify specific escape sequences (<, &, unescaped " in text), and critically verify round-trip fidelity via ChatHistory.from_rendered_prompt. Both template engines (handlebars and jinja2) have symmetric unit and e2e tests. Minor gaps: no test covers None/empty content in _message helpers (the jinja2 helper previously rendered None as the literal string "None" — now fixed but untested), and no test exercises attribute-value escaping in the handlebars _message block helper kwargs path. These are non-blocking.

✓ Design Approach

The PR correctly fixes an XML injection bug in both the Handlebars and Jinja2 _message helpers by switching from manual string concatenation to xml.etree.ElementTree.Element + tostring(), which properly escapes XML metacharacters in content. The approach is sound and the # nosec B405 annotations are appropriate (we are generating XML, not parsing untrusted input). The sample changes steering users toward message_to_prompt are a good idea since it uses to_prompt() / to_element() for full-fidelity serialization. However, the jinja2 _message(item) helper still only serializes item.content (the first TextContent text), silently dropping image items, function-call items, and any other non-text items from the message. Its sibling _message_to_prompt avoids this by delegating to item.to_prompt() / to_element(). Since _message(item) accepts a full ChatMessageContent, calers reasonably expect full serialization — yet the implementation silently produces an incomplete representation. The better approach for _message in jinja2 would be to call item.to_element() and use tostring() on the resulting element, exactly as to_prompt() does, rather than manually setting only the text field.

Suggestions

The Jinja2 _message(item) helper sets message.text = item.content, which only captures the first TextContent's text and silently drops non-text items (images, function calls, etc.). Additionally, item.content could be a non-string type, and ElementTree.tostring behavior with non-strings is implementation-defined. The better approach is to delegate to item.to_element() and call tostring() on that—the same way _message_to_prompt works via item.to_prompt()—which would avoid data loss for multi-modal messages and eliminate the type-safety concern.
Add a test for None/empty content in the Jinja2 _message helper—the old code rendered None as the literal string "None", and the new code correctly produces an empty element, but this fix is untested. A round-trip test with a ChatMessageContent that has no text items (e.g., a FunctionCallContent-only message) through {{ message(item) }} would lock this in.
Consider adding a test for attribute values containing XML metacharacters in the Handlebars _message block helper (e.g., {{#message role=role custom_attr=value}}...{{/message}} where value contains < or "). The Element.set() call correctly escapes these, but there's no test verifying it.
The Handlebars sample (azure_chat_gpt_api_handlebars.py) switched from {{#message role=role}}{{~content~}}{{/message}} to {{message_to_prompt}}, which changes the XML serialization format (from bare text to <text> child elements). While both formats round-trip correctly, this sample change is orthogonal to the XML-escaping fix and could be confusing to users who reference the old sample pattern.
The Handlebars _message helper now sets message.text = str(options['fn'](this)), treating the entire block output as plain text. For users with allow_dangerously_set_content=True who previously relied on embedding raw XML sub-structure (e.g., image or function-call tags) directly in the block, this is a silent behavior change. It would be worth documenting that _message now always produces plain-text message content, and that message_to_prompt should be used for structured/multi-modal content.

Automated review by moonbox3's agents

python/semantic_kernel/prompt_template/utils/jinja2_system_helpers.py

- Use public re-export for AuthorRole (from semantic_kernel.contents) in test_jinja2_prompt_template_e2e.py and test_handlebars_prompt_template_e2e.py - Replace manual XML construction in _message() with item.to_element() to properly serialize all content items (images, function calls, etc.) - Remove unused Enum and Element imports from jinja2_system_helpers.py - Update test assertion to match correct to_element() output format Co-authored-by: Copilot <[email protected]>

chetantoshniwal

Automated Code Review

Reviewers: 4 | Confidence: 87%

✓ Correctness

This PR fixes XML metacharacter escaping in both Handlebars and Jinja2 prompt template message helpers by switching from manual string concatenation to xml.etree.ElementTree for XML construction. The Handlebars _message helper now uses Element/tostring to properly escape content and attributes. The Jinja2 _message helper delegates to item.to_element()+tostring(), which changes its output format (content is now wrapped in child elements). Samples are updated to prefer message_to_prompt, and comprehensive tests verify round-trip correctness with XML metacharacters. The changes are logically correct and the tests properly validate the new escaping behavior.

✓ Security Reliability

This PR is a clear security improvement that fixes XML injection vulnerabilities in both the Handlebars and Jinja2 prompt template helpers. Previously, user-controlled content (message text, role values) was directly interpolated into XML strings via f-strings without escaping, allowing XML metacharacters like <, &, and " to break the XML structure or inject malicious content. The fix correctly switches to xml.etree.ElementTree.Element and tostring() for proper XML serialization with automatic escaping. The # nosec B405 suppression is appropriate since ElementTree is used only for XML generation, not parsing (where XXE risks exist). The new tests thoroughly verify escaping and round-trip fidelity. No blocking security or reliability issues found.

✓ Test Coverage

The PR properly escapes XML metacharacters in both the Handlebars and Jinja2 _message helpers by switching to xml.etree.ElementTree for XML generation. Test coverage is good overall: new unit tests and e2e tests verify escaping of <, &, and ", plus round-trip fidelity via ChatHistory.from_rendered_prompt. However, there are a few gaps: (1) no test covers the empty-content path added in the Handlebars helper (if content: message.text = content combined with short_empty_elements=False), (2) the Handlebars and Jinja2 _message helpers now produce structurally different XML (flat text vs <text> sub-elements via to_element()), and there is no test that validates both round-trip correctly for the same input, and (3) the > character is not tested even though it is an XML metacharacter.

✗ Design Approach

The PR correctly addresses XML injection by switching from f-string XML concatenation to xml.etree.ElementTree, which properly escapes metacharacters. However, it introduces a structural inconsistency between the two template-engine paths: the Jinja2 _message now delegates to item.to_element() (which wraps content in a <text> child element, producing <message role=".."><text>…</text></message>), while the Handlebars _message block helper still builds the element manually and sets message.text = content (producing the flat <message role="...">…</message> form). Because from_rendered_prompt must handle both formats and the serialization protocol is the only contract between the renderer and the parser, having two divergent XML shapes for the same logical object is a fragile, leaky design. The right fix is to unify both paths through item.to_element() / a shared serialisation helper, not to fix one engine while leaving the other inconsistent.

Automated review by chetantoshniwal's agents

python/semantic_kernel/prompt_template/utils/handlebars_system_helpers.py

python/tests/unit/prompt_template/test_handlebars_prompt_template.py

python/tests/unit/prompt_template/test_jinja2_prompt_template_e2e.py

…format The handlebars _message helper was producing <message role="...">content</message> (direct text node) while the Jinja2 path produced <message role="..."><text>content</text></message> via to_element(). When the context is a ChatMessageContent, delegate to to_element() so the XML contract is consistent across template engines. Keep the manual Element construction as a fallback for non-ChatMessageContent contexts. Add tests for: - Consistent <text> wrapper in handlebars output - Empty message content (no self-closing tags) - Greater-than character handling and round-trip - Backward compatibility of from_rendered_prompt with old format Co-authored-by: Copilot <[email protected]>

…erage - Use <text> child element in handlebars fallback path instead of direct text node, matching the Jinja2/to_element() XML contract - Add > to XML metacharacter escape test - Add tests for fallback path (non-ChatMessageContent context) with both empty and non-empty block content - Add backward compatibility test verifying from_rendered_prompt parses the old format without <text> wrapper Co-authored-by: Copilot <[email protected]>

Improve prompt-template msg serialize and sample usage

8bd2832

moonbox3 self-assigned this Apr 6, 2026

Copilot AI review requested due to automatic review settings April 6, 2026 03:13

moonbox3 requested a review from a team as a code owner April 6, 2026 03:13

Copilot started reviewing on behalf of moonbox3 April 6, 2026 03:14 View session

Copilot AI reviewed Apr 6, 2026

View reviewed changes

python/tests/unit/prompt_template/test_jinja2_prompt_template_e2e.py Outdated Show resolved Hide resolved

python/tests/unit/prompt_template/test_handlebars_prompt_template_e2e.py Outdated Show resolved Hide resolved

github-actions bot reviewed Apr 6, 2026

View reviewed changes

python/semantic_kernel/prompt_template/utils/jinja2_system_helpers.py Show resolved Hide resolved

MAF Dashboard Bot and others added 2 commits April 6, 2026 04:25

Address review feedback for microsoft#13738: review comment fixes

6752fc5

moonbox3 enabled auto-merge April 6, 2026 04:37

chetantoshniwal reviewed Apr 7, 2026

View reviewed changes

MAF Dashboard Bot and others added 3 commits April 7, 2026 01:08

Address review feedback for microsoft#13738: review comment fixes

abcec4b

moonbox3 requested a review from chetantoshniwal April 7, 2026 01:50

eavanvalkenburg approved these changes Apr 7, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Python: Improve prompt-template msg serialize and sample usage#13738

Python: Improve prompt-template msg serialize and sample usage#13738
moonbox3 wants to merge 6 commits intomicrosoft:mainfrom
moonbox3:jinja2-improvements

moonbox3 commented Apr 6, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

github-actions bot left a comment

Uh oh!

Uh oh!

chetantoshniwal left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

moonbox3 commented Apr 6, 2026

Motivation and Context

Description

Contribution Checklist

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

github-actions bot left a comment

Choose a reason for hiding this comment

Automated Code Review

✓ Security Reliability

✓ Test Coverage

✓ Design Approach

Suggestions

Uh oh!

Uh oh!

chetantoshniwal left a comment

Choose a reason for hiding this comment

Automated Code Review

✓ Correctness

✓ Security Reliability

✓ Test Coverage

✗ Design Approach

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants